Data Representation and Compression Using Linear-Programming Approximations

Authors

  • Hristo S. Paskov
  • John C. Mitchell
  • Trevor J. Hastie
Abstract

We propose ‘Dracula’, a new framework for unsupervised feature selection from sequential data such as text. Dracula learns a dictionary of n-grams that efficiently compresses a given corpus and recursively compresses its own dictionary; in effect, Dracula is a ‘deep’ extension of Compressive Feature Learning. It requires solving a binary linear program that may be relaxed to a linear program. Both problems exhibit considerable structure, their solution paths are well behaved, and we identify parameters which control the depth and diversity of the dictionary. We also discuss how to derive features from the compressed documents and show that while certain unregularized linear models are invariant to the structure of the compressed dictionary, this structure may be used to regularize learning. Experiments are presented that demonstrate the efficacy of Dracula’s features.
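The idea of choosing a dictionary of n-grams that compresses a corpus can be illustrated on a toy scale. The sketch below is not Dracula's formulation: it assumes a single (shallow) dictionary layer, assumes the document must be reconstructed as an exact concatenation of dictionary n-grams, charges one unit per stored dictionary character plus a hypothetical `pointer_cost` per pointer, and solves the resulting binary selection problem by brute force instead of a linear-programming relaxation. All function names and parameters are illustrative.

```python
from itertools import combinations


def candidate_ngrams(doc, max_n):
    """Return all distinct substrings of doc with length 1..max_n, sorted."""
    return sorted({doc[i:i + n]
                   for n in range(1, max_n + 1)
                   for i in range(len(doc) - n + 1)})


def min_pointers(doc, dictionary):
    """Fewest dictionary n-grams whose concatenation equals doc.

    Returns float('inf') if doc cannot be tiled with the dictionary.
    """
    inf = float("inf")
    best = [0] + [inf] * len(doc)  # best[i] = fewest pointers covering doc[:i]
    for end in range(1, len(doc) + 1):
        for gram in dictionary:
            start = end - len(gram)
            if start >= 0 and doc[start:end] == gram:
                best[end] = min(best[end], best[start] + 1)
    return best[len(doc)]


def compress(doc, max_n=3, pointer_cost=1.0):
    """Exhaustively pick the dictionary minimising the toy cost.

    Cost (an assumption for illustration) = total characters stored in the
    dictionary + pointer_cost * number of pointers needed to rebuild doc.
    Only feasible for very short documents: it enumerates every subset.
    """
    grams = candidate_ngrams(doc, max_n)
    best_cost, best_dict = float("inf"), None
    for k in range(1, len(grams) + 1):
        for subset in combinations(grams, k):
            cost = sum(len(g) for g in subset) + pointer_cost * min_pointers(doc, subset)
            if cost < best_cost:
                best_cost, best_dict = cost, subset
    return best_cost, best_dict


if __name__ == "__main__":
    print(compress("abcabcabc", max_n=3, pointer_cost=1.0))
```

On the repetitive string `"abcabcabc"` the cheapest choice under these assumptions is to store the single trigram `abc` and point to it three times. Each subset membership here is a 0/1 decision, which is the kind of binary variable that the abstract's binary linear program would relax to a value in [0, 1].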

Similar resources

Planelet Transform: A New Geometrical Wavelet for Compression of Kinect-like Depth Images

With the advent of cheap indoor RGB-D sensors, proper representation of piecewise planar depth images is crucial toward an effective compression method. Although there exist geometrical wavelets for optimal representation of piecewise constant and piecewise linear images (i.e. wedgelets and platelets), an adaptation to piecewise linear fractional functions which correspond to depth variation ov...

Non-Linear Image Representation Based on IDP with NN

This paper offers a method for non-linear still image representation based on pyramidal decomposition with a neural network. The approach is developed by analogy with the hypothesis that humans recognize images using consecutive approximations of increasing similarity. A hierarchical decomposition, named Inverse Difference Pyramid (IDP), is used for the image representation...

Target setting in the process of merging and restructuring of decision-making units using multiple objective linear programming

This paper presents a novel approach to achieving the goals of data envelopment analysis in the process of reconstruction and integration of decision-making units by using multiple objective linear programming. In this regard, first, we review inverse data envelopment analysis models for data reconstruction and integration. We present a model with multi-objective linear programming structure in...

Close interval approximation of piecewise quadratic fuzzy numbers for fuzzy fractional program

The fuzzy approach has undergone a profound structural transformation in the past few decades. Numerous studies have been undertaken to explain the fuzzy approach for linear and nonlinear programs. While the findings in earlier studies have been conflicting, recent studies of competitive situations indicate that the fractional programming problem has a positive impact on the comparative scenario. We pro...

A new solving approach for fuzzy multi-objective programming problem in uncertainty conditions by using semi-infinite linear programing

In practice, there are many problems in which the decision parameters are fuzzy numbers, and some of these problems are formulated using either possibilistic programming or multi-objective programming methods. In this paper, we consider a multi-objective programming problem with fuzzy data in the constraints and introduce a new approach for solving these problems based on a combination of the multi-objecti...

Journal:
  • CoRR

Volume: abs/1511.06606

Publication date: 2015